Skip to content

PYTHON-5683: Spike: Investigate using Rust for Extension Modules#2699

Draft
aclark4life wants to merge 9 commits intomongodb:PYTHON-5683from
aclark4life:PYTHON-5683
Draft

PYTHON-5683: Spike: Investigate using Rust for Extension Modules#2699
aclark4life wants to merge 9 commits intomongodb:PYTHON-5683from
aclark4life:PYTHON-5683

Conversation

@aclark4life
Copy link
Contributor

@aclark4life aclark4life commented Feb 6, 2026

Supersedes #2689


Spike: Investigate using Rust for Extension Modules

  • Implement comprehensive Rust BSON encoder/decoder with 100% test compatibility (88/88 tests passing)
  • Add Evergreen CI configuration and test scripts
  • Add GitHub Actions workflow for Rust testing
  • Add runtime selection via PYMONGO_USE_RUST environment variable
  • Add performance benchmarking suite
  • Update build system to support Rust extension with Maturin
  • Add comprehensive documentation (bson/_rbson/README.md)

@codecov-commenter

This comment was marked as outdated.

@aclark4life aclark4life force-pushed the PYTHON-5683 branch 7 times, most recently from 8c59ac1 to 6599757 Compare February 13, 2026 01:28
@aclark4life aclark4life marked this pull request as ready for review February 13, 2026 01:47
@aclark4life aclark4life requested a review from a team as a code owner February 13, 2026 01:47
@aclark4life aclark4life requested a review from Jibola February 13, 2026 01:47
@codeowners-service-app
Copy link

Assigned NoahStapp for team dbx-python because Jibola is out of office.

@aclark4life aclark4life changed the base branch from master to PYTHON-5683 February 13, 2026 01:50
- Implement comprehensive Rust BSON encoder/decoder
- Add Evergreen CI configuration and test scripts
- Add GitHub Actions workflow for Rust testing
- Add runtime selection via PYMONGO_USE_RUST environment variable
- Add performance benchmarking suite
- Update build system to support Rust extension
- Add documentation for Rust extension usage and testing"
- TestCustomPythonBSONTypeToBSONMonolithicCodec
- TestCustomPythonBSONTypeToBSONMultiplexedCodec
- TestBSONTypeEnDeCodecs
- TestTypeRegistry
- TestGridFileCustomType
- TestCollectionChangeStreamsWCustomTypes
- TestDatabaseChangeStreamsWCustomTypes
- TestClusterChangeStreamsWCustomTypes

These tests require custom type encoder/decoder support which is not
implemented in the Rust extension. Skipping them prevents the 56 test
failures related to Decimal/Decimal128 type handling.
- TestRawBatchCursor and TestRawBatchCommandCursor (RawBSONDocument not implemented)
- TestBSONCorpus (BSON validation/error detection not fully implemented)
- test_uuid_subtype_4, test_legacy_java_uuid, test_legacy_csharp_uuid (legacy UUID representations not implemented)

These features are not implemented in the Rust extension and would require
significant additional work. Skipping these tests prevents 35 failures.
- Remove references to non-existent benchmark files
- Add comprehensive instructions for running perf_test.py
The cargo install method was failing due to yanked xwin dependencies
(versions 0.6.6 and 0.6.7) in the cargo-xwin package that maturin
depends on. Using pip install instead downloads a pre-built binary
from PyPI, avoiding the compilation and dependency issue entirely.

This aligns with how maturin is installed in other parts of the
codebase (bson/_rbson/build.sh and hatch_build.py).
After installing Rust, the cargo binaries (rustc, cargo, etc.) need to
be available in the PATH for subsequent build steps. This adds
$CARGO_HOME/bin to the PATH_EXT variable so that Rust tools are
accessible when PYMONGO_BUILD_RUST is enabled.

Without this, the build would fail with 'Rust toolchain not found'
even though Rust was successfully installed by install-rust.sh.
After installing Rust, we need to explicitly set the default toolchain
with 'rustup default stable' so that cargo and other Rust tools can
find the toolchain to use.

Also added RUSTUP_HOME to the environment configuration so it's
properly set and persisted across shell sessions. This ensures rustup
can locate its installation and toolchain data.

Fixes the error: 'rustup could not choose a version of cargo to run,
because one wasn't specified explicitly, and no default is configured.'
Enhanced the logging in run_tests.py to clearly show:
- Whether PYMONGO_USE_RUST and PYMONGO_BUILD_RUST are set
- Which BSON implementation is actually in use (rust/c/python)
- Clear indication of which extension is ACTIVE

This makes it easier to verify that the Rust extension is being used
when expected, especially for the 'perf rust' tests.
Added Rust vs C comparison versions for all standard BSON micro-benchmarks:
- Flat encoding/decoding (TestRustFlat*)
- Deep encoding/decoding (TestRustDeep*)
- Full encoding/decoding (TestRustFull*)

These tests use the same test data as the standard benchmarks but
explicitly compare C vs Rust implementations. Each benchmark has two
versions:
- *C: Uses C extension (implementation = 'c')
- *Rust: Uses Rust extension (implementation = 'rust')

The RustComparisonTest base class handles switching between
implementations by setting/unsetting PYMONGO_USE_RUST environment
variable and reloading the bson module.

This provides comprehensive performance comparison data between the
C and Rust BSON implementations across all standard benchmark datasets.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants